15 research outputs found
Mine 'Em All: A Note on Mining All Graphs
We study the complexity of the problem of enumerating all graphs with frequency at least 1 and computing their support. We show that there are hereditary classes of graphs for which the complexity of this problem depends on the order in which the graphs should be enumerated (e.g. from frequent to infrequent or from small to large). For instance, the problem can be solved with polynomial delay for databases of planar graphs when the enumerated graphs should be output from large to small, but it cannot be solved even in incremental-polynomial time when the enumerated graphs should be output from most frequent to least frequent (unless P=NP).
Lifted Inference with Linear Order Axiom
We consider the task of weighted first-order model counting (WFOMC) used for
probabilistic inference in the area of statistical relational learning. Given a
formula, a domain size, and a pair of weight functions, what is the
weighted sum of all models of the formula over a domain of that size? It was shown
that computing WFOMC of any logical sentence with at most two logical variables
can be done in time polynomial in the domain size. However, it was also shown that the task
is #P_1-complete once a third variable is added, which inspired the
search for extensions of the two-variable fragment that would still permit a
running time polynomial in the domain size. One such extension is the two-variable
fragment with counting quantifiers. In this paper, we prove that adding a
linear order axiom (which forces one of the predicates in the formula to introduce a
linear ordering of the domain elements in each model of the formula) on top of the
counting quantifiers still permits a computation time polynomial in the domain
size. We present a new dynamic programming-based algorithm which can compute
WFOMC with linear order in time polynomial in the domain size, thus proving our primary
claim.
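To make the WFOMC definition in the abstract above concrete, here is a minimal brute-force sketch in Python. It is not the paper's dynamic programming algorithm: it enumerates every interpretation of a single binary predicate, so it runs in time exponential in the domain size. The function names, the example sentence (a symmetric, irreflexive relation) and the weights are assumptions chosen for illustration only.

```python
from itertools import product


def wfomc_brute_force(n, sentence, w, w_bar):
    """Sum the weights of all models of `sentence` over the domain {0, ..., n-1}.

    Assumes one binary predicate E. A true ground atom contributes weight `w`,
    a false one contributes `w_bar`; a model's weight is the product over atoms.
    """
    domain = range(n)
    atoms = [(x, y) for x in domain for y in domain]
    total = 0
    for values in product([False, True], repeat=len(atoms)):
        E = dict(zip(atoms, values))  # one interpretation of E
        if sentence(E, domain):
            weight = 1
            for value in values:
                weight *= w if value else w_bar
            total += weight
    return total


def symmetric_irreflexive(E, domain):
    """Hypothetical two-variable sentence: E is irreflexive and symmetric."""
    irreflexive = all(not E[(x, x)] for x in domain)
    symmetric = all((not E[(x, y)]) or E[(y, x)] for x in domain for y in domain)
    return irreflexive and symmetric


# With unit weights this counts undirected graphs on 3 labelled nodes.
print(wfomc_brute_force(3, symmetric_irreflexive, 1, 1))  # 8
```

With w = 2 and w_bar = 1, each undirected edge sets two atoms to true and contributes a factor of 4, so the result for a domain of size 3 is (4 + 1)^3 = 125.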
Fast Construction of Relational Features for Machine Learning
Department of Cybernetics
Lifted Algorithms for Symmetric Weighted First-Order Model Sampling
Weighted model counting (WMC) is the task of computing the weighted sum of
all satisfying assignments (i.e., models) of a propositional formula.
Similarly, weighted model sampling (WMS) aims to randomly generate models with
probability proportional to their respective weights. Both WMC and WMS are hard
to solve exactly, falling under the #P-hard complexity class.
However, it is known that the counting problem may sometimes be tractable, if
the propositional formula can be compactly represented and expressed in
first-order logic. In such cases, model counting problems can be solved in time
polynomial in the domain size, and are known as domain-liftable. The following
question then arises: Is it also the case for weighted model sampling? This
paper addresses this question and answers it affirmatively. Specifically, we
prove domain-liftability under sampling for the two-variable fragment of
first-order logic with counting quantifiers by devising an
efficient sampling algorithm for this fragment that runs in time polynomial in
the domain size. We then further show that this result continues to hold even
in the presence of cardinality constraints. To empirically verify our approach,
we conduct experiments over various first-order formulas designed for the
uniform generation of combinatorial structures and sampling in
statistical-relational models. The results demonstrate that our algorithm
outperforms a state-of-the-art WMS sampler by a substantial margin, confirming
the theoretical results.
Comment: 47 pages, 6 figures. An expanded version of "On exact sampling in the
two-variable fragment of first-order logic" in LICS23, submitted to AIJ.
arXiv admin note: substantial text overlap with arXiv:2302.0273
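The propositional definitions of WMC and WMS given in the abstract above can be illustrated with a small brute-force sketch. This is not the lifted sampling algorithm from the paper: it enumerates all assignments, which is exponential in the number of variables. The formula `a_or_b`, the variable names and the weights are hypothetical values chosen for illustration.

```python
import random
from itertools import product


def weighted_models(variables, formula, weight):
    """Return (assignment, weight) pairs for every model of `formula`.

    `weight[var][value]` is the weight of setting `var` to `value`;
    a model's weight is the product over its variables.
    """
    models = []
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if formula(assignment):
            model_weight = 1
            for var, value in assignment.items():
                model_weight *= weight[var][value]
            models.append((assignment, model_weight))
    return models


def wmc(variables, formula, weight):
    """Weighted model count: the sum of the weights of all models."""
    return sum(w for _, w in weighted_models(variables, formula, weight))


def wms(variables, formula, weight, rng):
    """Draw one model with probability proportional to its weight."""
    models = weighted_models(variables, formula, weight)
    assignments = [m for m, _ in models]
    weights = [w for _, w in models]
    return rng.choices(assignments, weights=weights, k=1)[0]


# Hypothetical example: the formula (a or b) with per-literal weights.
VARIABLES = ["a", "b"]
WEIGHT = {"a": {True: 2, False: 1}, "b": {True: 3, False: 1}}


def a_or_b(assignment):
    return assignment["a"] or assignment["b"]


print(wmc(VARIABLES, a_or_b, WEIGHT))  # 2 + 3 + 6 = 11
print(wms(VARIABLES, a_or_b, WEIGHT, random.Random(0)))
```

In this example the model with both variables true has weight 6 out of a total of 11, so the sampler returns it with probability 6/11.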
Prediction of DNA-binding propensity of proteins by the ball-histogram method using automatic template search
We contribute a novel ball-histogram approach to DNA-binding propensity prediction of proteins. Unlike state-of-the-art methods based on constructing an ad-hoc set of features describing physicochemical properties of the proteins, the ball-histogram technique enables a systematic, Monte-Carlo exploration of the spatial distribution of amino acids complying with automatically selected properties. This exploration yields a model for the prediction of DNA-binding propensity. We validate our method in prediction experiments, improving on state-of-the-art accuracies. Moreover, our method also provides interpretable features involving spatial distributions of selected amino acids.
A Note on Restricted Forms of LGG
We study the existence of a restricted least general generalization (LGG) with the property that LGGs of clauses from a pre-fixed set belong to this set. We show that there is no such LGG even in simple sets of clauses such as bounded-size clauses or treewidth-1 clauses.
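For readers unfamiliar with the LGG operation, the following sketch shows the standard (unrestricted) least general generalization of two atoms by anti-unification. It covers only this atom-level building block, not clause-level LGG or the restricted variants studied in the note. The term encoding (strings for constants, tuples for compound terms) and the generated variable names `V0`, `V1`, ... are assumptions made for illustration.

```python
def lgg(s, t):
    """Least general generalization of two terms by anti-unification.

    Constants are strings; compound terms and atoms are tuples of the form
    (functor, arg1, ..., argk). Each distinct pair of mismatching subterms
    is replaced by its own variable, reused wherever that pair recurs.
    """
    table = {}  # maps a pair of mismatching subterms to its variable

    def generalize(a, b):
        if a == b:
            return a
        same_shape = (
            isinstance(a, tuple)
            and isinstance(b, tuple)
            and a[0] == b[0]
            and len(a) == len(b)
        )
        if same_shape:
            args = tuple(generalize(x, y) for x, y in zip(a[1:], b[1:]))
            return (a[0],) + args
        if (a, b) not in table:
            table[(a, b)] = f"V{len(table)}"
        return table[(a, b)]

    return generalize(s, t)


# p(a, b) and p(c, b) generalize to p(V0, b).
print(lgg(("p", "a", "b"), ("p", "c", "b")))  # ('p', 'V0', 'b')
# p(a, a) and p(b, b) generalize to p(V0, V0): the same pair reuses its variable.
print(lgg(("p", "a", "a"), ("p", "b", "b")))  # ('p', 'V0', 'V0')
```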